Statistical Model-Based Voice Activity Detection Based on Second-Order Conditional MAP with Soft Decision
Abstract
© 2012 ETRI Journal, Volume 34, Number 2, April 2012.
In this paper, we propose a novel approach to statistical model-based voice activity detection (VAD) that incorporates a second-order conditional maximum a posteriori (CMAP) criterion. As a technical improvement over the first-order CMAP criterion in [1], we consider both the current observation and the voice activity decisions in the previous two frames to take full account of the interframe correlation of voice activity. This differs from the previous approach [1], which conditions on the decision in the previous single frame (first order): the second-order CMAP conditions on the decisions in the previous two frames and therefore uses four thresholds, giving an additional degree of freedom. A soft-decision scheme is also incorporated, resulting in time-varying thresholds for further performance improvement. Experimental results show that the proposed algorithm outperforms the conventional CMAP-based VAD technique under various experimental conditions.
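The four-threshold idea can be illustrated with a minimal sketch. It assumes a per-frame likelihood ratio has already been computed by some statistical model, and it shows only a hard-decision version with fixed thresholds; the soft-decision, time-varying thresholds described in the abstract are not modelled. The function name `second_order_cmap_vad`, the `THRESHOLDS` table, and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical thresholds indexed by (decision at frame n-1, decision at frame n-2).
# 1 = speech, 0 = non-speech. The values are illustrative only.
THRESHOLDS: Dict[Tuple[int, int], float] = {
    (0, 0): 2.0,  # two silent frames: require strong evidence for speech
    (0, 1): 1.5,
    (1, 0): 1.0,
    (1, 1): 0.5,  # two speech frames: weak evidence is enough to stay in speech
}


def second_order_cmap_vad(
    likelihood_ratios: Sequence[float],
    thresholds: Dict[Tuple[int, int], float] = THRESHOLDS,
) -> List[int]:
    """Return one 0/1 decision per frame.

    Each frame's likelihood ratio is compared against the threshold
    selected by the decisions made for the previous two frames.
    """
    prev1, prev2 = 0, 0  # assumption: silence precedes the first frame
    decisions: List[int] = []
    for ratio in likelihood_ratios:
        decision = 1 if ratio > thresholds[(prev1, prev2)] else 0
        decisions.append(decision)
        # Shift the two-frame decision history forward by one frame.
        prev1, prev2 = decision, prev1
    return decisions
```

With these illustrative values, a ratio of 1.2 is rejected after two non-speech frames (threshold 2.0) but accepted when the previous frame was speech (threshold 1.0), which is how conditioning on past decisions captures interframe correlation.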
Related resources
Toward detecting voice activity employing soft decision in second-order conditional MAP
In this paper, we propose a novel approach to statistical model-based voice activity detection (VAD) that incorporates a second-order conditional maximum a posteriori (MAP) criterion. As a technical improvement for the first-order conditional MAP criterion in [1], we consider both the current observation and the voice activity decision in the previous two frames to take full consideration of the ...
Voice activity detection based on conditional MAP criterion incorporating the spectral gradient
In this paper, we propose a novel approach to improve a statistical model-based voice activity detection (VAD) method based on a modified conditional maximum a posteriori (MAP) criterion incorporating the spectral gradient scheme. The proposed conditional MAP incorporates not only the voice activity decision in the previous frame as in [1] but also the spectral gradient of the observed spectra ...
A voice activity detector employing soft decision based noise spectrum adaptation
In this paper, a voice activity detector (VAD) for variable rate speech coding is decomposed into two parts, a decision rule and a background noise statistic estimator, which are analysed separately by applying a statistical model. A robust decision rule is derived from the generalized likelihood ratio test by assuming that the noise statistics are known a priori. To estimate the time-varying n...
Endpoint detection using weighted finite state transducer
In this paper, we discuss the possibility of applying a weighted finite state transducer (WFST) as a unified framework to solve the endpoint detection problem. In general, endpoint detection is composed of two cascaded decision processes. The first process is voice activity detection (VAD), which makes frame-level speech/non-speech classification. The second process is utterance-level detection which m...
Improved Global Soft Decision Incorporating Second-Order Conditional MAP in Speech Enhancement
In this paper, we propose a novel method based on the second-order conditional maximum a posteriori (CMAP) to improve the performance of the global soft decision in speech enhancement. The conventional global soft decision scheme is found through investigation to have a disadvantage in that the global speech absence probability (GSAP) in that scheme is adjusted by a fixed parameter, which could...